EXHIBIT B 



GENOMES 



T.A. BROWN 

Department of Biomolecular Sciences, UMIST, Manchester, M60 1QD, UK 




WILEY-LISS



A JOHN WILEY & SONS, INC., PUBLICATION 

New York • Chichester • Weinheim • Brisbane • Singapore • Toronto



Published in the United States of America, its dependent territories and Canada by John Wiley
& Sons, Inc., by arrangement with BIOS Scientific Publishers Ltd, 9 Newtec Place, Magdalen
Road, Oxford OX4 1RE, UK.

© BIOS Scientific Publishers Ltd, 1999 
First published 1999 

All rights reserved. No part of this book may be reproduced or transmitted, in any form or by any 
means, without permission. 

A CIP catalogue record for this book is available from the British Library. 
ISBN 0-471-31618-0 



Library of Congress Cataloging-in-Publication Data 
Brown, T.A. (Terence A.) 
Genomes / T.A. Brown. 

p. cm.
Includes bibliographical references and index. 
ISBN 0-471-31618-0 (pbk.) 
1. Genomes. I. Title. 
QH447.B76 1999 

572.8'6-DC21 99-12241 



USA 

John Wiley & Sons Inc., 

605 Third Avenue, New York, 

NY 10158-0012, USA 



Canada 

John Wiley & Sons (Canada) Ltd, 
22 Worcester Road, Rexdale, 
Ontario M9W 1L1, Canada 




Production Editor: Fran Kingston 

Typeset by J&L Composition Ltd, Filey, North Yorkshire, UK 
Illustrations drawn by J&L Composition Ltd, Filey, North Yorkshire, UK 
Printed by The Bath Press Ltd, Bath, UK 



13.1 MUTATIONS 331 



Figure 13.1 Mutation and recombination.

Panel (A), a mutation: a small-scale change in the nucleotide sequence converts the DNA molecule
...GACAGTACGA... / ...CTGTCATGCT... into the mutated DNA molecule ...GACATTACGA... / ...CTGTAATGCT...
Panel (B), recombination events: exchange between homologous DNA molecules (molecules with similar
nucleotide sequences), and transposition, in which the segment order A B C D E F G becomes A B E C D F G.

(A) A mutation is a small-scale change in the nucleotide sequence of a DNA molecule. A point mutation is shown but there are
other types of mutation, as described in the text. (B) Recombination events include exchange of segments of DNA,
as occurs during meiosis (see Figure 2.10, p. 26), and the movement of a segment from one position in a DNA molecule
to another, for example by transposition (Section 13.2.3).



replication), is carried out and regulated by enzymes 

and other proteins. 
Both mutation and recombination can have important 
effects on the cell in which they occur. A mutation in a key 
gene may cause the cell to die if it results in the protein 
coded by this gene being defective (Section 13.1.2) and 
some recombination events lead to changes in the bio- 
chemical capabilities of the cell, examples being those 
involved in immunoglobulin gene construction and yeast 
mating type switching. Other mutations and recombina- 
tion events have a less significant impact on the pheno- 
type of the cell and many have none at all. As we will see 
in Chapter 14, all mutations and recombination events 
that are not lethal have the potential to contribute to the 
evolution of the genome but for this to happen they must 
be inherited when the organism reproduces. With a single- 
celled organism such as a bacterium or yeast, all genome 
alterations that are not lethal or reversible are inherited 
by daughter cells and become permanent features of the 
lineage that descends from the original cell in which the
alteration occurred. In a multicellular organism, only 
those events that occur in germ cells are relevant to 
genome evolution. Changes to the genomes of somatic 



cells are unimportant in an evolutionary sense, but they 
will have biological relevance if they result in a deleteri- 
ous phenotype that affects the health of the organism. 



13.1 MUTATIONS 

With mutations, the issues that we have to consider are: 
how they arise; what effects they have on the genome and 
on the organism in which the genome resides; whether it 
is possible for a cell to increase its mutation rate and 
induce programmed mutations under certain circum- 
stances; and how mutations are repaired. 

13.1.1 The causes of mutations
Mutations arise in two ways: 

■ Some mutations are spontaneous errors in replication
that evade the proofreading function of the
DNA polymerases that synthesize new polynucleotides
at the replication fork (Section 12.3.2). These
mutations are called mismatches because they are





332 CHAPTER 13 • THE MOLECULAR BASIS OF GENOME EVOLUTION 



Box 13.1: Terminology for describing point 
mutations 

Point mutations are also called simple mutations or 
single-site mutations. They are sometimes described as 
substitution mutations but this risks confusion 
because to an evolutionary geneticist 'substitution' 
occurs only when a mutation becomes fixed in a population
(see Box 15.5, p. 408), so every individual displays
it, as opposed to when the mutation first appears in a
single organism.

Point mutations are divided into two categories:

■ Transitions are purine-to-purine or pyrimidine-to-pyrimidine
changes: A→G, G→A, C→T or T→C.

■ Transversions are purine-to-pyrimidine or
pyrimidine-to-purine changes: A→C, A→T, G→C,
G→T, C→A, C→G, T→A or T→G.
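The two categories in Box 13.1 can be expressed as a small classifier. The following is a minimal sketch rather than anything taken from the book; the function name classify_point_mutation is an illustrative assumption.

```python
# Purines and pyrimidines among the four DNA bases.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_point_mutation(original, mutated):
    """Return 'transition' or 'transversion' for a single-base change.

    A transition keeps the base within its chemical class (purine or
    pyrimidine); a transversion switches class.
    """
    original, mutated = original.upper(), mutated.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutated not in valid:
        raise ValueError("bases must be A, C, G or T")
    if original == mutated:
        raise ValueError("no change: not a point mutation")
    same_class = (original in PURINES) == (mutated in PURINES)
    return "transition" if same_class else "transversion"


# Enumerate all twelve possible single-base changes.
ALL_CHANGES = [(a, b) for a in "ACGT" for b in "ACGT" if a != b]
TRANSITIONS = [c for c in ALL_CHANGES if classify_point_mutation(*c) == "transition"]
TRANSVERSIONS = [c for c in ALL_CHANGES if classify_point_mutation(*c) == "transversion"]
```

Enumerating every change reproduces the counts listed in the box: four transitions and eight transversions.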



positions where the nucleotide that is inserted 
into the daughter polynucleotide does not match, 
by base-pairing, the nucleotide at the correspond- 
ing position in the template DNA (Figure 13.2A). If 
the mismatch is retained in the daughter double 
helix then one of the granddaughter molecules pro- 
duced during the next round of DNA replication will 
carry a permanent, double-stranded version of the 
mutation. 

■ Other mutations arise because a mutagen has reacted
with the parent DNA, causing a structural change
that affects the base-pairing capability of the altered
nucleotide. Usually this alteration affects only one
strand of the parent double helix, so only one of the
daughter molecules carries the mutation, but two of
the granddaughter molecules produced during the
next round of replication will have it (Figure 13.2B).
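How a mismatch becomes a permanent, double-stranded mutation after one further round of replication can be traced with a toy model. This is a sketch under simplifying assumptions: both strands are written left to right in pairing register, as in Figure 13.2, replication is otherwise error-free, and the helper names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-by-base partner of a strand, kept in the same register."""
    return "".join(COMPLEMENT[base] for base in strand)


def replicate(duplex):
    """Error-free semiconservative replication.

    Each parental strand templates a new partner, so one duplex gives
    two daughter duplexes.
    """
    top, bottom = duplex
    return [(top, complement(top)), (complement(bottom), bottom)]


def is_fully_paired(duplex):
    top, bottom = duplex
    return complement(top) == bottom


parent = ("GACTTAGAAA", "CTGAATCTTT")

# Daughter in which the newly made top strand received C instead of T
# at index 3, leaving a C.A mismatch (the sequences used in Figure 13.2A).
mismatched_daughter = ("GACCTAGAAA", "CTGAATCTTT")

granddaughters = replicate(mismatched_daughter)
```

In this model one granddaughter carries the change in both strands and the other is identical to the parent, which is the outcome described for a replication error.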



Errors in replication

When considered purely as a chemical reaction, comple- 
mentary base-pairing is not particularly accurate and if it 
was possible to copy a DNA template in the test tube, 
without the aid of any enzymes, then the resulting 
polynucleotide would probably have point mutations at 
5-10 positions out of every hundred. This represents an 
error rate of 5-10%, which would be completely unaccept- 
able during genome replication. The template-dependent 
DNA polymerases that carry out DNA replication must 
therefore increase the accuracy of the process by several 
orders of magnitude. This improvement is brought about 
in two ways: 

■ The DNA polymerase operates a nucleotide selection
process that dramatically increases the accuracy of
template-dependent DNA synthesis (Figure 13.3A).
This selection process probably acts at three different
stages during the polymerization reaction, discrimination
against an incorrect nucleotide occurring
when the nucleotide is first bound to the DNA polymerase,
when it is shifted to the active site of the
enzyme, and when it is attached to the 3'-end of the
polynucleotide that is being synthesized.
■ The accuracy of DNA synthesis is increased still further
if the DNA polymerase possesses a 3'→5'
exonuclease activity and so is able to remove an
incorrect nucleotide that evades the base selection
process and becomes attached to the 3'-end of the
new polynucleotide (see Figure 12.11B, p. 313). This is
called proofreading (Section 12.3.2), but the name is
a misnomer because the process is not an active
checking mechanism. Instead, each step in the synthesis
of a polynucleotide should be viewed as a
competition between the polymerase and exonuclease
functions of the enzyme, the polymerase usually
winning because it is more active than the exonuclease,
at least when the 3'-terminal nucleotide is base-paired
to the template. But the polymerase activity is
less efficient if the terminal nucleotide is not base-paired,
the resulting pause in polymerization allowing
the exonuclease activity to predominate so the
incorrect nucleotide is removed (see Figure 13.3B).

Escherichia coli is able to synthesize DNA with an error
rate of only 1 per 10^7 nucleotide additions. Interestingly,
these errors are not evenly distributed between the two
daughter molecules, the product of lagging strand replication
being prone to about 20 times as many errors as the
leading strand replicant. This asymmetry might indicate
that DNA polymerase I, which is involved only in lagging
strand replication (Section 12.3.2), has a less effective base
selection and proofreading capability compared with DNA
polymerase III, the main replicating enzyme (Francino
and Ochman, 1997).

Not all of the errors that occur during DNA synthesis 
can be blamed on the polymerase enzymes: sometimes an 
error occurs even though the enzyme adds the 'correct' 
nucleotide, the one that base-pairs with the template. This 
is because each nucleotide base can occur as either of two 
alternative tautomers, structural isomers that are in 
dynamic equilibrium. For example, thymine exists as two 
tautomers, the keto and enol forms, with individual mole- 
cules occasionally undergoing a shift from one tautomer 
to the other. The equilibrium is biased very much towards 
the keto form but every now and then the enol version of 
thymine occurs in the template DNA at the precise time 
that the replication fork is moving past. This will lead to 
an 'error', because enol-thymine base-pairs with G rather
than A (Figure 13.4). The same problem can occur with
adenine, the rare imino tautomer of this base preferentially
forming a pair with C, and guanine, enol-guanine
pairing with thymine. After replication, the rare tautomer 
will inevitably revert to its more common form, leading 
to a mismatch in the daughter double helix. 
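The pairing rules just described can be tabulated to follow each rare tautomer through to the base change it would eventually fix. This is a sketch only: the pairings come from the text, while the assumption that the mismatch goes unrepaired and the name resulting_substitution are illustrative.

```python
# Pairing partner of each base in its common and rare tautomeric forms,
# as given in the text.
PAIRS_WITH = {
    ("T", "keto"): "A", ("T", "enol"): "G",
    ("A", "amino"): "T", ("A", "imino"): "C",
    ("G", "keto"): "C", ("G", "enol"): "T",
}
RARE_FORM = {"T": "enol", "A": "imino", "G": "enol"}
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def resulting_substitution(template_base):
    """Follow a rare-tautomer event through two rounds of replication.

    Round 1: the template base is in its rare form, so the 'wrong'
    nucleotide is inserted opposite it.
    Round 2: that inserted nucleotide acts as a template and pairs
    normally, fixing a new base at the original position.
    Returns (original base, base that replaces it).
    """
    inserted = PAIRS_WITH[(template_base, RARE_FORM[template_base])]
    replacement = WATSON_CRICK[inserted]
    return template_base, replacement
```

Under these assumptions each of the three cases yields a transition (T to C, A to G, G to A) rather than a transversion.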

As stated above, the error rate for DNA synthesis in
E. coli is 1 in 10^7. The overall error rate for replication
of the E. coli genome is only 1 in 10^10 to 1 in 10^11, the






Figure 13.2 Examples of mutations.

(A) An error in replication. Parent molecule: ...GACTTAGAAA... / ...CTGAATCTTT... A replication error gives one
daughter molecule ...GACCTAGAAA... / ...CTGAATCTTT...; the other daughter is unchanged. Of the granddaughter
molecules, one is the mutated molecule ...GACCTAGAAA... / ...CTGGATCTTT... and the others are
...GACTTAGAAA... / ...CTGAATCTTT...

(B) One possible effect of a mutagen. Parent molecule with an altered nucleotide (X):
...GACTTAGAAA... / ...CTGAXTCTTT... One daughter is the mutated molecule ...GACTCAGAAA... / ...CTGAXTCTTT...
Two granddaughters are mutated molecules: ...GACTCAGAAA... / ...CTGAGTCTTT... and
...GACTCAGAAA... / ...CTGAXTCTTT...







[...] to identify the mutation or mutations responsible for [...] the disease [...]. [...] has been
characterized, methods are needed so that clinicians can screen many DNA samples in order to identify
individuals who have the mutation and are at risk of developing the disease or passing it on to their
children.

A mutation can be identified by DNA sequencing [...] sequencing is relatively slow and would be
inappropriate for screening a large number of samples. DNA chip technology (Technical Note 2.3, p. 23)
could also be employed, but this is not yet a widely available option. For these reasons, a number of
'low technology' methods have been devised. These [...] into two categories: mutation scanning
techniques [...] information about the position of a mutation [...], and mutation screening techniques,
which determine whether a specific mutation is present.

Most scanning techniques involve analysis of the heteroduplex formed between a single strand of the DNA
being examined and the complementary strand of a control DNA that has the unmutated sequence. [...]
single mismatched position in the heteroduplex where a base pair has not formed. Various techniques can
be used for [...] or not (Gottoh, 1997):

■ Electrophoresis or high performance liquid chromatography (HPLC) can [...] identifying the difference
in the mobility of the mis[...]

■ [...]tion followed by gel electrophoresis will locate the position of a mismatch. If the heteroduplex
[...] intact then no mismatch is present; if it is cleaved then it contains a mismatch, the position of
the mutation in the test DNA being indicated by the sizes of the cleavage products. Cleavage is carried
out by treatment with enzymes or chemicals that cut at single-stranded regions of mainly double-stranded
DNA, or with a single-strand specific ribonuclease such as S1 (see Figure 5.7, p. 94) if the hybrid has
been formed between the control DNA and an RNA version of the [...]

Most screening methods for detection of specific mutations make use of the ability of oligonucleotides
to distinguish between target DNAs whose sequences differ at just one nucleotide position (see
Figure 2.7, p. 22). In allele-specific oligonucleotide (ASO) hybridization the DNA samples are screened
by probing with an oligonucleotide that hybridizes only to the mutant sequence. This is an efficient
procedure but it is [...]. The DNA samples are usually obtained by PCR of [...] presence [...] is
indicated by the synthesis or otherwise of a PCR product.

[Figure labels: heteroduplex of test DNA and control DNA with the mismatch position marked; dot blot, DNA
samples spotted on to a nylon membrane; ASO hybridization; autoradiograph, DNA containing the mutant
sequence.]







Figure 13.3 Mechanisms for ensuring the accuracy of DNA replication.

Panel labels: (A) Nucleotide selection by the DNA polymerase. (B) 'Proofreading': last nucleotide is
base-paired, polymerase wins; last nucleotide is not base-paired, 3'→5' exonuclease wins.

(A) The DNA polymerase actively selects the correct
nucleotide to insert at each position. (B) Those errors that
occur can be corrected by 'proofreading' if the polymerase
has a 3'→5' exonuclease activity. If the last nucleotide that
was inserted is base-paired to the template then the
polymerase activity predominates, but if the last nucleotide is
not base-paired then the exonuclease activity is favored.

improvement compared to the polymerase error rate 
being due to the mismatch repair system (Section 13.1.4) 
that scans newly replicated DNA for positions where the 
bases are unpaired and hence corrects the few mistakes 
that the replication enzymes make. The implication is
that only one uncorrected replication error occurs every 
1000 times that the E. coli genome is copied. 
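The relationship between these per-nucleotide error rates and the 'one error per 1000 copies' figure can be checked with simple arithmetic. This is a rough sketch: the genome size of about 4.64 million base pairs is an assumption that does not appear in this excerpt, as is the simplification that copying the genome means synthesizing two new strands of that length.

```python
# Assumption (not stated in this excerpt): approximate size of the
# E. coli genome in base pairs.
GENOME_SIZE_BP = 4_640_000


def errors_per_replication(error_rate, genome_size_bp=GENOME_SIZE_BP):
    """Expected number of errors each time the genome is copied.

    Two new strands are synthesized per replication, so the number of
    nucleotide additions is taken as twice the genome size.
    """
    return error_rate * 2 * genome_size_bp


def replications_per_error(error_rate, genome_size_bp=GENOME_SIZE_BP):
    """Average number of genome copies made per uncorrected error."""
    return 1 / errors_per_replication(error_rate, genome_size_bp)


polymerase_only = errors_per_replication(1e-7)       # before mismatch repair
copies_at_1e10 = replications_per_error(1e-10)       # overall rate, upper end
copies_at_1e11 = replications_per_error(1e-11)       # overall rate, lower end
```

With these assumptions the polymerase alone would make close to one error per genome copy, while an overall rate of 1 in 10^10 corresponds to roughly one error per thousand copies, in line with the figure quoted in the text.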



Figure 13.4 The effects of tautomerism on base-pairing.

Panels: keto-thymine undergoes a tautomeric shift to enol-thymine, which base-pairs with G not A;
amino-adenine shifts to imino-adenine, which base-pairs with C not T; keto-guanine shifts to
enol-guanine, which base-pairs with T not C.

In each of these three examples, the two tautomeric forms of
the base have different pairing properties. Cytosine also has
amino and imino tautomers but both pair with G.



Not all errors in replication are point mutations. Aberrant 
replication can also result in small numbers of extra 
nucleotides being inserted into the polynucleotide being 
synthesized, or some nucleotides in the template not 
being copied. Insertions and deletions are often called 



frameshift mutations because when one occurs within a 
coding region it can result in a shift in the reading frame 
used for translation of the protein specified by the gene 
(see Figure 13.12, p. 342). However, it is inaccurate to use 
'frameshift' to describe all insertions and deletions 






because they can occur anywhere, not just in genes, and 
not all insertions or deletions in coding regions result in 
frameshifts: an insertion or deletion of three nucleotides, 
or multiples of three, simply adds or removes codons or 
parts of adjacent codons without affecting the reading 
frame. 
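The point about multiples of three can be made concrete with a short example. This is a minimal sketch; the 12-nucleotide sequence and the helper names are hypothetical and chosen only for illustration.

```python
def is_frameshift(indel_length):
    """An insertion or deletion shifts the reading frame unless its
    length is a multiple of three."""
    if indel_length <= 0:
        raise ValueError("indel length must be positive")
    return indel_length % 3 != 0


def codons(sequence):
    """Split a sequence into complete triplets, ignoring any leftover."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def delete(sequence, start, length):
    """Remove `length` nucleotides beginning at index `start`."""
    return sequence[:start] + sequence[start + length:]


coding = "ATGGCATTCGGA"                      # hypothetical: ATG GCA TTC GGA
in_frame = codons(delete(coding, 3, 3))      # one whole codon removed
shifted = codons(delete(coding, 3, 1))       # single nucleotide removed
```

Deleting three nucleotides removes one codon and leaves the downstream codons intact, whereas deleting a single nucleotide changes every codon that follows.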

Insertion and deletion mutations can affect all parts of 
the genome but are particularly prevalent when the tem- 
plate DNA contains short repeated sequences, such as are 
found in microsatellites (Section 6.3.1). This is because 
repeated sequences can induce replication slippage, in 
which the template strand and its copy shift their relative 
positions so that part of the template is either copied 
twice or missed out. The result is that the new poly- 
nucleotide has a larger or smaller number, respectively, of 
the repeat units (Figure 13.5). This is the main reason why
microsatellite sequences are so variable, replication slip- 
page occasionally generating a new length variant, 
adding to the collection of alleles already present in the 
population (Section 15.3.2). 
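The effect of slippage on repeat number can be modelled by treating a microsatellite as a repeat unit, a copy number and two flanking sequences. This is a toy sketch, not a model of the polymerase; the flanking sequences and function names are illustrative assumptions.

```python
import re


def count_repeats(sequence, unit):
    """Length, in repeat units, of the longest tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def slipped_copy(left_flank, unit, n_units, right_flank, change):
    """New strand made after slippage.

    A positive `change` means part of the template was copied twice
    (extra units); a negative value means part was missed out.
    """
    new_count = n_units + change
    if new_count < 0:
        raise ValueError("cannot lose more units than are present")
    return left_flank + unit * new_count + right_flank


# A 5-unit CA repeat, as in Figure 13.5, with hypothetical flanks.
template = "GG" + "CA" * 5 + "TT"
longer = slipped_copy("GG", "CA", 5, "TT", +1)
shorter = slipped_copy("GG", "CA", 5, "TT", -1)
```

Each slippage event adds a new length variant, which is how a population comes to hold many alleles that differ only in repeat number.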

Replication slippage is probably also responsible for 
the trinucleotide repeat expansion diseases that have 
been discovered in humans in recent years (Ashley and 
Warren, 1995). Each of these neurodegenerative diseases 
is due to a relatively short series of trinucleotide repeats 
becoming elongated to two or more times its normal 
length. For example, the human Hdh gene contains the 
sequence 5'-CAG-3' repeated between 10 and 35 times in 



tandem, coding for a series of glutamines in the protein 
product. In Huntington's disease this repeat expands to a 
copy number of 36-121, increasing the length of the 
polyglutamine tract and resulting in a dysfunctional protein.
Several other human diseases are also due to expansions
of polyglutamine codons (Table 13.1). Some diseases
associated with mental retardation result from trinucleotide
expansions in the leader region of a gene, giving a
fragile site, a position where the chromosome is likely to
break (Sutherland et al., 1998), and expansions involving
intron and trailer regions are also known.
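The copy-number ranges quoted for the Hdh CAG repeat can be applied to a sequence as follows. This is an illustrative sketch using only the ranges given in the text (10 to 35 normal, 36 to 121 in Huntington's disease); it is not a diagnostic procedure, and the test sequences are hypothetical.

```python
import re

NORMAL_RANGE = range(10, 36)      # 10-35 copies, as quoted in the text
EXPANDED_RANGE = range(36, 122)   # 36-121 copies, as quoted in the text


def longest_cag_run(sequence):
    """Number of units in the longest tandem run of CAG."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_cag_count(n_units):
    """Place a CAG copy number in the ranges quoted for the Hdh gene."""
    if n_units in NORMAL_RANGE:
        return "normal"
    if n_units in EXPANDED_RANGE:
        return "expanded"
    return "outside the ranges quoted in the text"


# Hypothetical sequences: a start codon, a CAG tract, and a short tail.
normal_allele = "ATG" + "CAG" * 20 + "CCG"
expanded_allele = "ATG" + "CAG" * 40 + "CCG"
```

Because each CAG codon specifies glutamine, the copy number returned by longest_cag_run is also the length of the polyglutamine tract encoded by the repeat.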

How triplet expansions are generated is not precisely 
understood. The size of the insertion is much greater than 
occurs with normal replication slippage, such as is seen 
with microsatellite sequences, and once the expansion 
reaches a certain length it appears to become susceptible 
to further expansion in subsequent rounds of replication, 
leading to the disease becoming increasingly severe in 
succeeding generations. The possibility that expansion 
involves formation of hairpin loops in the DNA has been 
raised, based on the observation that only a limited num- 
ber of trinucleotide sequences are known to undergo 
expansion, and all of these sequences are GC-rich and so 
might form stable secondary structures. Studies of similar 
triplet expansions in yeast have shown that these are 
more prevalent when the RAD27 gene is inactivated 
(Freudenreich et al., 1998), an interesting observation as
RAD27 is the yeast version of the mammalian gene for 



Figure 13.5 Replication slippage.

Panel summary: the parent molecule ...CACACACACA... / ...GTGTGTGTGT... is replicated; as a result of
replication slippage one daughter molecule gains an extra CA in its new strand. The granddaughter
molecules include ...CACACACACACA... / ...GTGTGTGTGTGT..., which has an additional repeat unit, and
...CACACACACA... / ...GTGTGTGTGT...

The diagram shows replication of a 5-unit CA repeat microsatellite. Slippage has occurred during replication of the parent
molecule, inserting an additional repeat unit into the newly synthesized polynucleotide of one of the daughter molecules. When
this daughter molecule replicates it gives a granddaughter molecule whose microsatellite array is one unit longer than that of the
original parent.






Table 13.1 Examples of human trinucleotide repeat expansions

             Repeat sequence
Locus     Normal          Mutated            Associated disease

Polyglutamine expansions (all in coding regions of genes)
Hdh       (CAG)10-35      (CAG)36-121        Huntington's disease
AR        (CAG)[?]-33     (CAG)[?]           Spinal and bulbar muscular atrophy
B37       (CAG)7-25       (CAG)49-75         Dentatorubral-pallidoluysian atrophy
MJD1      (CAG)12-37      (CAG)61-[?]        Machado-Joseph disease
SCA1      (CAG)[?]-39     (CAG)41-[?]        Spinocerebellar ataxia type 1

Fragile site expansions (probably all in the untranslated leader regions of genes)
FRAXA     (CGG)[?]-52     (CGG)[?]-1000      Fragile X syndrome
FRAXE     (GCC)[?]        (GCC)[?]-750       Fragile XE mental retardation
FRAXF     (GCC)[?]        (GCC)[?]-1000      None
FRA11B    (CGG)11         (CGG)80-1000       Predisposed to Jacobsen syndrome
FRA16A    (CCG)[?]        (CCG)1000-1900     None

Other expansions (positions described below)
DMPK      (CTG)[?]        (CTG)50-3000       Myotonic dystrophy
FRDA      (GAA)[?]        (GAA)>200          Friedreich's ataxia

For more details see Ashley and Warren (1995). The DMPK and FRDA expansions are in the trailer and intron regions of their genes,
respectively, and are thought to affect RNA processing (Ashley and Warren, 1995; Campuzano et al., 1996). There are also a few
disease-causing mutations that involve expansions of longer sequences, such as progressive myoclonus epilepsy caused by a
(CCCCGCCCCGCG)2-3 to (CCCCGCCCCGCG)>12 expansion in the promoter region of the EPM1 locus (Mandel, 1997).



FEN1, the protein involved in processing of Okazaki frag- 
ments (Section 12.3.2). This might indicate that a tri- 
nucleotide repeat expansion is caused by an aberration in 
lagging strand synthesis. 



Chemical and physical mutagens

Many chemicals that occur naturally in the environment 
have mutagenic properties and these have been supple- 
mented in recent years with other chemical mutagens that 
result from human industrial activity. Physical agents 
such as radiation are also mutagenic. Most organisms are 
exposed to greater or lesser amounts of these various 
mutagens and their genomes suffer damage as a result. 

The definition of the term 'mutagen' is a chemical or phys- 
ical agent that causes mutations. This definition is important 
because it distinguishes mutagens from other types of 
environmental agent that cause damage to cells in ways 
other than by causing mutations (Table 13.2). There are
overlaps between these categories (for example, some 
mutagens are also carcinogens) but each type of agent has 
a distinct biological effect. The definition of mutagen also 
makes a distinction between true mutagens and other 
agents that damage DNA without causing mutations, for 
example by causing breaks in DNA molecules. This type of 
damage may block replication and cause the cell to die, but 
it is not a mutation in the strict sense of the term and the 
causative agents are therefore not mutagens. 

Mutagens cause mutations in three different ways: 

■ Some act as base analogs and are mistakenly used as
substrates when new DNA is synthesized at the
replication fork.

■ Some react directly with DNA, causing structural
changes that lead to miscopying of the template
strand when the DNA is replicated. These structural
changes are diverse, as we will see when we look at
individual mutagens.

■ Some mutagens act indirectly on DNA. They do not
themselves affect DNA structure, but instead cause
the cell to synthesize chemicals such as peroxides
that have the direct mutagenic effect.

The range of mutagens is so vast that it is difficult to 
devise an all-embracing classification. We will therefore 
restrict our study to the most common types. For chemi- 
cal mutagens these are as follows: 

■ Base analogs are purine and pyrimidine bases that
are similar enough to the standard bases to be 



Table 13.2 Categories of environmental agent that cause
damage to living cells

Agent         Effect on living cells

Carcinogen    Causes cancer - the neoplastic transformation of eukaryotic cells
Clastogen     Causes fragmentation of chromosomes
Mutagen       Causes mutations
Oncogen       Induces tumor formation
Teratogen     Results in developmental abnormalities

Based on Twyman (1998).






incorporated into nucleotides when these are synthesized
by the cell. The resulting unusual nucleotides
can then be used as substrates for DNA synthesis during
genome replication. For example, 5-bromouracil
(5-bU; Figure 13.6A) has the same base-pairing properties
as thymine and nucleotides containing this
base can be added to the daughter polynucleotide at
positions opposite A's in the template. The mutagenic
effect arises because the equilibrium between the two
tautomers of 5-bU is shifted more towards the rarer
enol form than is the case with thymine. This means
that during the next round of replication there is a relatively
high chance of the polymerase encountering
enol-5bU, which (like enol-thymine) pairs with G
rather than A (Figure 13.6B). This results in a point
mutation (Figure 13.6C). 2-Aminopurine acts in a similar
way: it is an analog of adenine with an amino-tautomer
that pairs with thymine and an imino-tautomer
that pairs with cytosine, the imino form being less
uncommon than imino-adenine and hence inducing T
to C transitions during DNA replication.
■ Deaminating agents also cause point mutations. A



Figure 13.6 5-Bromouracil and its mutagenic effect.

Panels: (A) 5-Bromouracil. (B) Base-pairing with 5-bromouracil: the keto form pairs with adenine, the
enol form pairs with guanine. (C) The mutagenic effect of 5-bromouracil: ...GATACTAG... / ...CTATGATC...
becomes ...GABACTAG... / ...CTATGATC... on insertion of 5-bromouracil (B); after a tautomeric shift the
next round of replication gives ...GABACTAG... / ...CTGTGATC...; 5-bU then shifts back to the keto
tautomer, and a further round gives the mutated molecule ...GACACTAG... / ...CTGTGATC...

See the text for details.






certain amount of base deamination (removal of an
amino group) occurs spontaneously in genomic 
DNA molecules, with the rate being increased by 
chemicals such as nitrous acid, which deaminates 
adenine, cytosine and guanine (thymine has no 
amino group and so cannot be deaminated), and 
sodium bisulfite, which acts only on cytosine. De- 
amination of guanine is not mutagenic because the 
resulting base, xanthine, blocks replication when it 
appears in the template polynucleotide. Deamina- 
tion of adenine gives hypoxanthine (Figure 13.7), 
which pairs with C rather than T, and deamination of 
cytosine gives uracil, which pairs with A rather than 
G. Deaminations of these two bases therefore result 
in point mutations when the template strand is 
copied. 

■ Alkylating agents are a third type of mutagen that
can give rise to point mutations. Chemicals such as 
ethylmethane sulfonate (EMS) and dimethylnitro- 
samine add alkyl groups to nucleotides in DNA 
molecules, as do methylating agents such as methyl 
halides which are present in the atmosphere, and the 
products of nitrite metabolism. The effect that alkyla- 
tion has depends on the position at which the 
nucleotide is modified and the type of alkyl group 
that is added. Methylations, for example, often result
in modified nucleotides with altered base-pairing 
properties and so lead to point mutations. Other 
alkylations block replication by forming crosslinks 
between the two strands of a DNA molecule, or by 
adding large alkyl groups that prevent progress of 
the replication complex. 
■ Intercalating agents are usually associated with
insertion mutations. The best known mutagen of this 
type is ethidium bromide, which fluoresces when 
exposed to UV radiation and so is used to reveal the 
positions of DNA bands after agarose gel electro- 
phoresis (see Technical Note 3.2, p. 43). Ethidium 
bromide and other intercalating agents are flat mole- 
cules that can slip in between base pairs in the dou- 
ble helix, slightly unwinding the helix and hence 
increasing the distance between adjacent base pairs 
(Figure 13.8). 
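The outcomes of deamination described in the list above can be followed through to the base change they produce. This is a minimal sketch based only on the pairings stated in the text; the function name and the returned strings are illustrative assumptions.

```python
# Product of deaminating each base, as given in the text.
DEAMINATION_PRODUCT = {"A": "hypoxanthine", "C": "uracil", "G": "xanthine"}

# Base that each product pairs with when it appears in the template.
# Xanthine is absent because it blocks replication.
PRODUCT_PAIRS_WITH = {"hypoxanthine": "C", "uracil": "A"}

WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def deamination_outcome(base):
    """Describe what deamination of `base` leads to after replication."""
    base = base.upper()
    if base == "T":
        return "cannot be deaminated (no amino group)"
    product = DEAMINATION_PRODUCT[base]
    partner = PRODUCT_PAIRS_WITH.get(product)
    if partner is None:
        return "replication blocked"
    # The mispaired partner pairs normally in the next round, so the
    # original base is replaced by the partner's complement.
    replacement = WATSON_CRICK[partner]
    return f"{base} to {replacement} point mutation"
```

Followed through in this way, both mutagenic deaminations give transitions: adenine is eventually replaced by guanine and cytosine by thymine.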

The most important types of physical mutagen are: 
■ UV radiation of 260 nm induces dimerization of
adjacent pyrimidine bases, especially if these are 



both thymines (Figure 13.9A), resulting in a
cyclobutyl dimer. Other pyrimidine combinations 
also form dimers, the order of frequency being 
5'-CT-3' > 5'-TC-3' > 5'-CC-3'. Purine dimers are 
much less common. UV-induced dimerization usu- 
ally results in a deletion mutation when the modified 
strand is copied. Another type of UV-induced photo- 
product is the (6-4) lesion in which carbons num- 
ber 4 and 6 of adjacent pyrimidines become 
covalently linked (Figure 13.9B). 
■ Ionizing radiation has various effects on DNA
depending on the type of radiation and its intensity.
Point, insertion and/or deletion mutations might
arise, as well as more severe forms of DNA damage 
that prevent subsequent replication of the genome. 
Some types of ionizing radiation act directly on DNA, 
others act indirectly by stimulating the formation of 
reactive molecules such as peroxides in the cell. 

Figure 13.7 Hypoxanthine is a deaminated version of
adenine.

Figure 13.8 The mutagenic effect of ethidium bromide.

(A) Ethidium bromide is a flat, plate-like molecule that is able
to slot in between the base pairs of the double helix. (B)
Ethidium bromide molecules are seen intercalated into the
helix: the molecules are viewed sideways on. Note that
intercalation results in the distance between adjacent base
pairs being increased.










Figure 13.9 Photoproducts induced by UV irradiation.

A segment of a polynucleotide containing two adjacent thymine bases is shown. (A) A thymine dimer contains two UV-induced 
covalent bonds, one linking the carbons at position 6 and the other linking the carbons at position 5. (B) The (6-4) lesion involves 
formation of a covalent bond between carbons 4 and 6 of the adjacent nucleotides. 



■ Heat stimulates the water-induced cleavage of the
β-N-glycosidic bond that attaches the base to the
sugar component of the nucleotide (Figure 13.10A).
This occurs more frequently with purines rather 
than pyrimidines and results in an AP (apurinic/ 
apyrimidinic) or baseless site. The sugar-phosphate 
that is left is unstable and rapidly degrades, leaving 
a gap if the DNA molecule is double-stranded (Figure 
13.10B). This reaction is not normally mutagenic 
because cells have effective systems for repairing 
nicks (Section 13.1.4), which is reassuring when one 
considers that 10000 AP sites are generated in each 
human cell per day. Gaps do, however, lead to mutations
under certain circumstances, for example in E.
coli when the SOS response is activated, when gaps
are filled with A's regardless of the identity of the 
nucleotide in the other strand (Section 13.1.3). 



When considering the effects of mutations we must make 
a distinction between the direct effect that a mutation has 
on the functioning of a genome and its indirect effect on 
the phenotype of the organism in which it occurs. The 
direct effect is relatively easy to assess because we can use 
our understanding of gene structure and expression to 
predict the impact that a mutation will have on genome 
function. The indirect effects are more complex because 
these relate to the phenotype of the mutated organism 
which, as described in Section 5.2.2, is often difficult to 
correlate with the activities of individual genes. 



Many mutations result in nucleotide sequence changes
that have no effect on the functioning of the genome.




Figure 13.10 The mutagenic effect of heat.

(B) The effect of hydrolysis on double-stranded DNA: hydrolysis
leaves a missing base, which is followed by a gap in one strand.



These silent mutations include virtually all of those that
occur in extragenic DNA and in the noncoding components
of genes and gene-related sequences. In other
words, some 97% of the human genome (see Box 6.4,
p. 135) can be mutated without significant effect.

Mutations in the coding regions of genes are much
more important. First, we will look at point mutations
that change the sequence of a triplet codon. A mutation of
this type will have one of four effects (Figure 13.11).

* It may result in a synonymous change, the new
  codon specifying the same amino acid as the unmutated
  codon. A synonymous change is therefore a
  silent mutation because it has no effect on the coding
  function of the genome: the mutated gene codes for
  exactly the same protein as the unmutated gene.
* It may result in a nonsynonymous change, the mutation
  altering the codon so that it specifies a different
  amino acid. The protein coded by the mutated gene
  therefore has a single amino acid change, which
  often has no significant effect on the biological activity
  of the protein: most proteins can tolerate at least a
  few amino acid changes without noticeable effect on
  their ability to function in the cell, but changes to
  some amino acids, such as those at the active site of
  an enzyme, have a greater impact. A nonsynonymous
  change is also called a missense mutation.
* The mutation may convert a codon that specifies an
  amino acid into a termination codon. This is a nonsense
  mutation and it results in a shortened protein
  because translation of the mRNA stops at this new
  termination codon rather than proceeding to the correct
  termination codon, which is further downstream.
  The effect that this has on the protein activity
  depends on how much of the polypeptide is lost;
  usually the effect is drastic and the protein is
  nonfunctional.
* The mutation could convert a termination codon into
  one specifying an amino acid, resulting in
  readthrough of the stop signal so the protein is
  extended by an additional series of amino acids at its
  C-terminus. Most proteins can tolerate short extensions
  without an effect on function, but longer extensions
  might interfere with folding of the protein and
  so result in reduced activity.

Synonymous      GGA  gly
Nonsynonymous   ATA  ile
Nonsense        TAA  stop
Readthrough     TTA  leu

ATGGGCAAATATAGCATTCCATAAAAATATATA...
met gly lys tyr ser ile pro stop

Figure 13.11 Effects of point mutations on the coding
region of a gene.

Four different effects of point mutations are shown, as
described in the text. The readthrough mutation results in
the gene being extended beyond the end of the sequence
shown here, the leucine codon created by the mutation being
followed by AAA = lys, TAT = tyr and ATA = ile. See Figure
10.7, p. 237, for the genetic code.
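The four outcomes listed above can be expressed as a short decision rule. The following Python sketch is illustrative only: the function name `classify_point_mutation` is an assumption made for this example, and the codon table is the standard genetic code written out in compact form (DNA codons, with `*` marking a termination codon).

```python
# Standard genetic code for DNA codons; '*' marks a termination codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_point_mutation(original_codon: str, mutated_codon: str) -> str:
    """Classify a codon change as one of the four effects described in the text.

    Hypothetical helper written for this example.
    """
    before = CODON_TABLE[original_codon.upper()]
    after = CODON_TABLE[mutated_codon.upper()]
    if before == after:
        return "synonymous"       # same amino acid (or still a stop): silent
    if after == "*":
        return "nonsense"         # amino acid codon becomes a termination codon
    if before == "*":
        return "readthrough"      # termination codon becomes an amino acid codon
    return "nonsynonymous"        # different amino acid (missense)


# The four mutant codons from Figure 13.11, each compared with the
# unmutated codon from which it can arise by a single nucleotide change.
for old, new in [("GGC", "GGA"), ("AAA", "ATA"), ("TAT", "TAA"), ("TAA", "TTA")]:
    print(old, "->", new, classify_point_mutation(old, new))
```

Note that this rule describes only the direct effect on the coding sequence; whether a nonsynonymous change matters to the protein depends on where the amino acid lies, as explained above.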

Deletion and insertion mutations also have distinct
effects on the coding capabilities of genes (Figure 13.12). If
the number of deleted or inserted nucleotides is three or
a multiple of three then one or more codons are removed
or added, the resulting loss or gain of amino acids having
varying effects on the function of the encoded protein.
Deletions or insertions of this type are often inconsequential
but will have an impact if, for example, amino acids
involved in an enzyme's active site are lost, or if an insertion
disrupts an important secondary structure in the protein.
On the other hand, if the number of deleted or
inserted nucleotides is not three or a multiple of three
then a frameshift results, all of the codons downstream of
the mutation being taken from a different reading frame
from that used in the unmutated gene. This usually has a
significant effect on the protein function, because a
greater or lesser part of the mutated polypeptide has a
completely different sequence to the normal polypeptide.



3-Nucleotide deletion

...ATGGGCTATAGCATTCCATAAAAATATATA...
met gly tyr ser ile pro stop

Unmutated sequence

...ATGGGCAAATATAGCATTCCATAAAAATATATA...
met gly lys tyr ser ile pro stop

1-Nucleotide deletion

...ATGGGAAATATAGCATTCCATAAAAATATA...
met gly asn ile ala phe his lys asn ile

Figure 13.12 Deletion mutations.

In the top sequence three nucleotides comprising a single codon are deleted. This shortens the resulting protein product by one
amino acid but does not affect the rest of its sequence. In the lower section, a single nucleotide is deleted. This results in a
frameshift so all the codons downstream of the deletion are changed, including the termination codon which is now read through.
See Figure 10.7, p. 237, for the genetic code. Note that if a three-nucleotide deletion removes parts of adjacent codons then
the result is more complicated than shown here. Consider, for example, deletion of the trinucleotide GCA from the sequence
...ATGGGCAAATAT... coding for Met-Gly-Lys-Tyr. The new sequence is ...ATGGAATAT..., coding for Met-Glu-Tyr. Two amino
acids have been replaced by a single, different one.
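The deletions shown in Figure 13.12 can be reproduced by translating the sequences before and after each change. The sketch below is a minimal illustration, assuming the standard genetic code; the helper names `translate` and `delete` are hypothetical, and translation simply starts at the first nucleotide and stops at the first termination codon or when fewer than three nucleotides remain.

```python
# Standard genetic code for DNA codons; '*' marks a termination codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first nucleotide until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete(dna: str, start: int, length: int) -> str:
    """Remove `length` nucleotides beginning at 0-based position `start`."""
    return dna[:start] + dna[start + length:]


gene = "ATGGGCAAATATAGCATTCCATAAAAATATATA"   # sequence from Figure 13.12

print(translate(gene))                 # unmutated: met gly lys tyr ser ile pro
print(translate(delete(gene, 6, 3)))   # codon AAA removed: one amino acid lost
print(translate(delete(gene, 5, 1)))   # one nucleotide removed: frameshift and readthrough
# Three nucleotides spanning two codons (GCA) removed from ATGGGCAAATAT:
print(translate(delete("ATGGGCAAATAT", 4, 3)))
```

The one-letter amino acid codes are used in the output, so the unmutated gene gives MGKYSIP and the frameshifted version gives MGNIAFHKNI, corresponding to the three-letter sequences in the figure.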




It is less easy to make generalizations about the effects 
of mutations that occur outside of the coding regions of 
the genome. Any protein binding site is susceptible to 
point, insertion or deletion mutations that change the 
identity or relative positioning of nucleotides involved in 
the DNA-protein interaction. These mutations therefore 
have the potential to inactivate promoters or regulatory 
sequences, with predictable consequences for gene 
expression (Figure 13.13; Sections 8.2 and 8.3). Origins of 
replication could conceivably be made nonfunctional by 
mutations that change, delete or disrupt sequences recog- 
nized by the relevant binding proteins (Section 12.3.1) but 
these possibilities are not well-documented. There is also 
little information about the potential impact on gene 
expression of mutations that affect nucleosome positioning
(Section 8.1.1).

One area that has been better researched concerns 
mutations that occur in introns or at intron-exon bound- 
aries. In these regions, single point mutations will be 
important if they change nucleotides involved in the 
RNA-protein and RNA-RNA interactions that occur dur- 
ing splicing of different types of intron (Sections 9.2.3 and 
9.3.3). For example, mutation of either the G or T in the 
DNA copy of the 5' splice site of a GU-AG intron, or of 
the A or G at the 3' splice site, will disrupt splicing 
because the correct intron-exon boundary will no longer 
be recognized. This may mean that the intron is not 
removed from the pre-mRNA, but it is more likely that a
cryptic splice site (see p. 215) will be used as an alternative.
It is also possible for a mutation within an intron or
an exon to create a new cryptic site that is preferred over
a genuine splice site that is not itself mutated. Both types
of event have the same result, relocation of the active
splice site, leading to aberrant splicing. This might delete
amino acids from, or add new amino acids to, the resulting
protein, or lead to a frameshift. Several versions of the blood
disease β-thalassemia are caused by mutations that lead to
cryptic splice site selection during processing of β-globin
transcripts.
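The dependence of splicing on the boundary dinucleotides can be illustrated with a very small check. The sketch below is a simplification that looks only at the GT and AG dinucleotides in the DNA copy of a GU-AG intron; real splice-site recognition involves more sequence than this. The intron sequence and the function names are invented for the example.

```python
def splice_sites_intact(intron_dna: str) -> bool:
    """True if the DNA copy of the intron begins with GT and ends with AG."""
    seq = intron_dna.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def point_mutate(dna: str, position: int, new_base: str) -> str:
    """Return a copy of `dna` with the base at 0-based `position` replaced."""
    return dna[:position] + new_base + dna[position + 1:]


intron = "GTAAGTCTTTTCTTTCAG"   # hypothetical intron, for illustration only

print(splice_sites_intact(intron))                                       # both sites present
print(splice_sites_intact(point_mutate(intron, 0, "A")))                 # 5' G mutated
print(splice_sites_intact(point_mutate(intron, len(intron) - 1, "C")))   # 3' G mutated
print(splice_sites_intact(point_mutate(intron, 8, "A")))                 # internal change
```

A check of this kind says nothing about cryptic splice sites, which would require scanning the surrounding sequence for alternative GT and AG positions.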



The effects of mutations on multicellular organisms
Now we turn to the indirect effects that mutations have 
on organisms, beginning with multicellular, diploid 
eukaryotes such as humans. The first issue to consider is 
the relative importance of the same mutation in a somatic 
cell compared with a germ cell. Because somatic cells do 
not pass copies of their genomes to the next generation, a 
somatic cell mutation is important only for the organism 
in which it occurs: it has no potential evolutionary 
impact. In fact, most somatic cell mutations have no sig- 
nificant effect, even if they result in cell death, because 
there are many other identical cells in the same tissue and 
the loss of one cell is immaterial. An exception is when a
mutation causes a somatic cell to malfunction in a way 
that is harmful to the organism, for instance by inducing 
tumor formation or other cancerous activity. 

Mutations in germ cells are more important because 
they can be transmitted to members of the next genera- 
tion and will then be present in all the cells of any indi- 
vidual who inherits the mutation. Most mutations, 
including all silent ones as well as many in coding 
regions, will still not change the phenotype of the organ- 
ism in any significant way. Those that do have an effect 
can be divided into two categories: 

* Loss-of-function is the normal result of a mutation
  that reduces or abolishes a protein activity. Most
  loss-of-function mutations are recessive, because in a
  heterozygote the second chromosome copy carries an
  unmutated version of the gene coding for a fully
  functional protein whose presence compensates for
  the effect of the mutation (Figure 13.14). There are
  some exceptions where a loss-of-function mutation is
  dominant, one example being haploinsufficiency,
  where the organism is unable to tolerate the approximately
  50% reduction in protein activity suffered by
  the heterozygote. In humans this is the explanation
  of a few genetic diseases, including Marfan syndrome,
  which results from a mutation in the gene for
  the connective tissue protein called fibrillin.

Deletion of the regulatory sequence: uncontrolled transcription.
Deletion of the core promoter: no transcription.

Figure 13.13 Two possible effects of deletion mutations in the region upstream of a gene.






(Labels: pair of homologous chromosomes; loss-of-function mutation, no protein product; protein product.)



Figure 13.14 A loss-of-function mutation is usually 
recessive because a functional version of the gene is present 
on the second chromosome copy. 



* Gain-of-function mutations are much less common.
The mutation must be one that confers an abnormal 
activity on a protein. Many gain-of-function muta- 
tions are in regulatory sequences rather than coding 
regions, and can therefore have a number of conse- 
quences. For example, a mutation might lead to one 
or more genes being expressed in the wrong tissues, 
these tissues gaining functions that they normally 
lack. Alternatively the mutation could lead to over- 
expression of one or more genes involved in control 
of the cell cycle, thus leading to uncontrolled cell 
division and hence to cancer. Because of their nature, 
gain-of-function mutations are usually dominant. 

There are added complications when considering the 
effects of mutations on the phenotypes of multicellular 
organisms. Not all mutations have an immediate effect on 
the organism: some are delayed-onset and only confer an 
altered phenotype later in the individual's life. Others 
display nonpenetrance in some individuals, never being 
expressed even though the individual has a dominant 
mutation or is a homozygous recessive. With humans, 
these factors complicate attempts to map disease-causing 
mutations by pedigree analysis (Section 2.3.2), because 
they introduce uncertainties regarding which members of 
a pedigree carry a mutant allele. 

The effects of mutations on microorganisms 
Mutations in microbes such as bacteria and yeast can also 
be described as loss-of-function or gain-of-function, but 
with microorganisms this is neither the normal nor most 
useful classification scheme. Instead, a more detailed 
description of the phenotype is usually attempted, based 
on the growth properties of mutated cells in various cul- 
ture media. This enables most mutations to be assigned to 
one of four categories: 

* Auxotrophs are cells that will only grow when provided
with a nutrient not required by the unmutated
organism. For example, E. coli normally makes its
own tryptophan, courtesy of the enzymes coded by
the five genes in the tryptophan operon (Figure 6.14B,
p. 133). If one of these genes is mutated in such a way
that its protein product is inactivated, then the cell is
no longer able to make tryptophan and so is a tryptophan
auxotroph. It cannot survive on a medium
that lacks tryptophan, being able to grow only when



Figure 13.15 A tryptophan auxotrophic mutant.

Two Petri-dish cultures are shown. Both contain minimal 
medium, which contains just the basic nutritional 
requirements for bacterial growth (nitrogen, carbon and 
energy sources, plus some salts). The medium on the left is 
supplemented with tryptophan but the medium on the right is
not. Unmutated bacteria, plus tryptophan auxotrophs, can
grow on the plate on the left, the auxotrophs growing because
the medium supplies the tryptophan that they cannot make
themselves. Tryptophan auxotrophs cannot grow on the plate
on the right, because this does not contain tryptophan. To
identify a tryptophan auxotroph, colonies are first grown on
the minimal medium + tryptophan plate and then transferred
to the minimal medium plate by replica plating. In this
procedure, a sterile felt pad is pressed onto the colonies on 
the minimal medium + tryptophan plate, carefully removed, 
and then pressed onto the surface of the minimal medium 
plate, transferring a few bacteria from one plate to the other. 
After incubation, colonies appear on the minimal medium 
plate in the same relative positions as on the plate containing 
tryptophan, except for the tryptophan auxotrophs which do 
not grow. These colonies can therefore be identified and 
samples of the tryptophan auxotrophic bacteria recovered 
from the minimal medium + tryptophan plate. 



this amino acid is provided as a nutrient (Figure 
13.15). Unmutated bacteria, which do not require 
extra supplements in their growth media, are called 
prototrophs. 

* Conditional-lethal mutants are unable to withstand
certain growth conditions: under permissive condi- 
tions they appear to be entirely normal but when 
transferred to restrictive conditions the mutant phe- 
notype is seen. Temperature-sensitive mutants are 
typical examples of conditional-lethals. Temperature- 
sensitive mutants behave like wild-type cells at low 
temperatures but exhibit their mutant phenotype 
when the temperature is raised above a certain 
threshold value, which is different for each mutant. 
Usually this is because the mutation reduces the sta- 
bility of a protein, so the protein becomes unfolded 
and hence inactive when the temperature is raised. 
* Inhibitor-resistant mutants are able to resist the
toxic effects of an antibiotic or other type of inhibitor.
There are various molecular explanations for this
type of mutant. In some cases the mutation changes
the structure of the protein that is targeted by the
inhibitor, so the latter can no longer bind to the protein
and interfere with its function. This is the basis
of streptomycin-resistance in E. coli, which results
from a change in the structure of ribosomal protein
S12. Another possibility is that the mutation changes
the properties of a protein responsible for transporting
the inhibitor into the cell, this often being the way
in which resistance to toxic metals is acquired.
* Regulatory mutants have defects in promoters and
other regulatory sequences. This category includes
constitutive mutants, which continually express
genes that are normally switched on and off under
different conditions. For example, a mutation in the
operator sequence of the lactose operon (Section
8.3.1) can prevent the repressor from binding and so
results in the lactose operon being expressed all the
time, even when lactose is absent and the genes
should be switched off (Figure 13.16).

In addition to these four categories, many mutations are 
lethal and so result in death of the mutant cell, and others 
have no effect. The latter are less common in micro- 
organisms than in higher eukaryotes, because most 
microbial genomes are relatively compact with little non- 
coding DNA. Mutations can also be leaky, meaning that 
a less extreme form of the mutant phenotype is expressed. 
For example, a leaky version of the tryptophan auxotroph 
illustrated in Figure 13.15 would grow slowly on minimal 
medium, rather than not growing at all. 
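The logic of identifying auxotrophs by replica plating (Figure 13.15) amounts to comparing which colony positions grow on each plate. The sketch below is a toy model written for illustration: colonies are represented by made-up grid positions, the two phenotype labels are assumptions of the example, and leaky mutants are not modelled.

```python
def grows(colony_type: str, tryptophan_in_medium: bool) -> bool:
    """Decide whether a colony of the given type grows on the medium."""
    if colony_type == "prototroph":
        return True                      # makes its own tryptophan
    if colony_type == "trp_auxotroph":
        return tryptophan_in_medium      # needs tryptophan to be supplied
    raise ValueError(f"unknown colony type: {colony_type}")


def replica_plate(master: dict, tryptophan_in_medium: bool) -> set:
    """Return the positions at which colonies appear on the replica plate."""
    return {pos for pos, kind in master.items() if grows(kind, tryptophan_in_medium)}


def identify_auxotrophs(master: dict) -> set:
    """Positions that grow with tryptophan but are missing on minimal medium."""
    supplemented = replica_plate(master, tryptophan_in_medium=True)
    minimal = replica_plate(master, tryptophan_in_medium=False)
    return supplemented - minimal


# Hypothetical master plate: position on the plate -> phenotype of the colony.
master_plate = {
    (0, 0): "prototroph",
    (0, 1): "trp_auxotroph",
    (1, 0): "prototroph",
    (1, 1): "trp_auxotroph",
}
print(sorted(identify_auxotrophs(master_plate)))
```

The set difference mirrors the experimental step: colonies present on the supplemented plate but absent from the same positions on the minimal plate are the tryptophan auxotrophs, which are then recovered from the supplemented plate.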



Is it possible for cells to utilize mutations in a positive 
fashion, either by increasing the rate at which mutations 
appear in their genomes, or by directing mutations 
towards specific genes? Both types of event might appear, 
at first glance, to go against the accepted wisdom that 



Figure 13.16 Effect of a constitutive mutation in the
lactose operator.

The operator sequence has been altered by a mutation and 
the lactose repressor can no longer bind to it. The result is 
that the lactose operon is transcribed all the time, even when 
lactose is absent from the medium. This is not the only way in 
which a constitutive lac mutant can arise. For example, the 
mutation could be in the gene coding for the lactose 
repressor, changing the tertiary structure of the repressor 
protein so that its DNA-binding motif is disrupted and it can 
no longer recognize the operator sequence, even when the 
latter is unmutated. See Figure 8.15, p. 187, for more details
about the lactose repressor and its regulatory effect on 
expression of the lactose operon. 



mutations occur randomly but, as we shall see, hyper- 
mutation and programmed mutations are possible with- 
out contravening this dogma. 

Hypermutation occurs when a cell causes the rate at 
which mutations occur in its genome to increase. This 
might appear to be an illogical thing to do as it is difficult 
to imagine situations where an increased mutation rate 
would be beneficial, and with the best studied example of 
hypermutation we do in fact know rather more about the 
process itself than about the reasons why it occurs. This is 
the SOS response of E. coli, which is induced when the
genome of the bacterium suffers extensive damage, typically
as a result of exposure to UV radiation or chemical
mutagens. The SOS response enables the cell to replicate
its DNA even though the template polynucleotides contain
AP sites and/or cyclobutyl dimers and other photoproducts
that would normally block or at least delay the
replication complex. This requires construction of a mutasome,
comprising several copies of the RecA protein and
of the UmuD'2C complex, the latter a trimer made up of
two UmuD' proteins and one copy of UmuC (Goodman,
1998). The RecA proteins coat the DNA in the region adjacent
to the damage position and the UmuD'2C complexes
bind to the attached RecA proteins. Somehow this enables 
DNA polymerase III to proceed past the damaged site 
and continue replicating the DNA. 

The SOS response is primarily looked on as the last 
best chance that the bacterium has to replicate its DNA 
and hence survive under adverse conditions. However, 
the price of survival is an increased mutation rate because 
the mutasome does not repair damage, it simply allows a 
damaged region of a polynucleotide to be replicated. 
When it encounters a damaged position in the template 
DNA the polymerase selects a nucleotide more or less at 
random, though with some preference for placing an A 
opposite an AP site: in effect the error rate of the replica- 
tion process becomes increased. It has been suggested 
that this increased mutation rate is the purpose of the SOS 
response, mutation for some reason or other being an 
advantageous response to DNA damage, but this idea 
remains controversial (Walker, 1995). 

Less controversial is the way in which vertebrates, 
including humans, are able to increase the mutation rate 
at one specific gene in one type of cell. This phenomenon 
takes us back to the way in which immunoglobulin diver- 
sity is generated, which we have already touched upon in 
Section 11.2.1 when we examined the genome rearrange- 
ments that result in joining of the V, D, J and H segments 
of the immunoglobulin heavy and light genes (see Figure 
11.14, p. 280). Additional diversity is produced by hyper- 
mutation of the V gene segments, after assembly of the 
intact immunoglobulin gene (Figure 13.17), the mutation
rate for these segments being 6-7 orders of magnitude 
greater than the background mutation rate experienced 
by the rest of the genome (Shannon and Weigert, 1998). 
This enhanced mutation rate appears to result from the 
unusual behavior of the mismatch repair system which 
normally corrects replication errors. At all other positions 
within the genome, the mismatch repair system corrects 







Figure 13.17 Hypermutation of the V gene segment of an intact immunoglobulin gene.

Mutations occur in the V segments. See Figure 11.14, p. 280, for a description of the events leading to assembly of an immunoglobulin gene.



errors of replication by searching for mismatches and 
replacing the nucleotide in the daughter strand, this 
being the strand that has just been synthesized and so 
contains the error (see Section 13.1.4). At V gene seg- 
ments, the repair system changes the nucleotide in the 
parent strand, and so stabilizes the mutation rather than 
correcting it (Cascalho et al., 1998). The mechanism by
which this is achieved has not yet been described. 

An apparent increase in mutation rate arising from 
modifications to the normal DNA repair process does not 
contradict the dogma regarding the randomness of muta- 
tions. Where problems have arisen is with reports, dating 
back to 1988 (Cairns et al., 1988), suggesting that E. coli is
able to direct mutations towards genes whose mutation 
would be advantageous under the environmental condi- 
tions that the bacterium is encountering. The original 
experiments involved a strain of E. coli that has a 
frameshift mutation in the lactose operon, inactivating 
the proteins needed for utilization of this sugar (Research 
Briefing 13.1). The bacteria were spread on an agar 
medium in which the only carbon source was lactose. 
This meant that a cell could grow and divide only if a second
mutation occurred in the lactose operon, restoring
the correct reading frame and therefore allowing the lac- 
tose enzymes to be synthesized. Mutations with this 
effect appeared to occur significantly more frequently 
than expected, and at a rate that was greater than muta- 
tions in other parts of the genomes of these E. coli cells. 

These experiments suggested that bacteria can pro- 
gramme mutations according to the selective pressures 
that they are placed under. In other words, the environ- 
ment can directly affect the phenotype of the organism, as 
suggested by Lamarck, rather than operating through the 
random processes postulated by Darwin. With the impli- 
cations being so radical it is not surprising that the exper- 
iments have been debated at length with numerous 
attempts to discover flaws in their design or alternative 



explanations for the results. At present, the possibility 
that the enhanced mutation rate is not programmed 
towards specific genes but occurs throughout the genome 
is being re-examined (Bridges, 1997), and models based 
on gene amplification rather than selective mutation are 
being tested (Andersson et al., 1998). 

13.1.4 DNA repair

In view of the thousands of damage events that genomes 
suffer every day, coupled with the errors that occur when 
the genome replicates, it is essential that cells possess effi- 
cient repair systems. Without these repair systems a 
genome would not be able to maintain its essential cellular
functions for more than a few hours before key genes
became inactivated by DNA damage. Similarly, cell lin- 
eages would accumulate replication errors at such a rate 
that their genomes would become dysfunctional after a 
few cell divisions. 

Most cells possess five different categories of DNA 
repair system: 

* Direct repair systems, as the name suggests, act
  directly on damaged nucleotides, converting each
  one back to its original structure.

* Base excision repair involves removal of a damaged
  nucleotide base, excision of a short piece of the
  polynucleotide around the AP site thus created, and
  resynthesis with a DNA polymerase.

* Nucleotide excision repair is similar to base excision
  repair but is not preceded by removal of a damaged
  base and can act on more substantially damaged
  areas of DNA.

* Mismatch repair corrects errors of replication, again
  by excising a stretch of single-stranded DNA containing
  the offending nucleotide and then repairing
  the resulting gap.







Research Briefing 13.1: Adaptive mutations?

In 1988 startling results were published suggesting that under some circumstances Escherichia coli
... environmental stress.

The randomness of mutations is an important concept in
biology because it is a requirement of the Darwinian view
of evolution, which holds that changes in the characteristics
of an organism occur by chance and are not influenced by
the environment in which the organism is placed. Beneficial
changes are positively selected and harmful ones are
negatively selected (see Box 15.5, p. 408). In contrast, the
Lamarckian theory of evolution, which biologists rejected
well over a century ago, states that organisms acquire
changes that enable them to adapt to their environment.
The Darwinian view requires that mutations occur at random,
whereas Lamarckian evolution demands that adaptive
mutations occur in response to the environment.

Random mutations in E. coli

The randomness of mutations in bacteria was first demonstrated
by Luria and Delbrück in 1943. They grew a series
of E. coli cultures in different flasks and then added T1
bacteriophages to each one. Most of the bacteria were killed
by the phage, but a few T1-resistant mutants were able to
survive. These were identified by plating samples from each
culture, soon after T1 infection, onto an agar medium. If
mutations leading to T1 resistance occurred randomly in
the cultures before the bacteriophages were added, each
culture would contain different numbers of resistant
mutants, the numbers depending on how early during the
growth period the first mutant cells arose. Those that arose
early would divide many times to give rise to a large number
of resistant progeny in the culture at the end of the
growth period, whereas those that arose later would give
rise to just a few progeny. Some cultures would therefore
contain many T1-resistant cells and others would contain
just a few. Alternatively, if resistant bacteria arose by adaptive
mutation only when the T1 phage were added, then all
cultures would have similar numbers of mutants (see figure).



(Figure labels: Time (h); Cultures 1 to 4; Culture 1, early mutation; Culture 2, later mutation; Add T1 phage.
RANDOM MUTATIONS: variable numbers in different cultures. ADAPTIVE MUTATIONS: same number in each culture.)
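The contrast between the two hypotheses can be explored with a small simulation. The following Python sketch is a toy model, not a reconstruction of the original experiment: the culture size, mutation probability and number of cultures are arbitrary values chosen for the example, and the function names are invented. It compares the variance-to-mean ratio of resistant-cell counts under random mutation during growth with the ratio expected if resistance arose only on exposure to the phage.

```python
import random
from statistics import mean, pvariance


def random_mutation_culture(rng, generations, mu):
    """Grow from one sensitive cell; each new cell may mutate at random during growth."""
    sensitive, resistant = 1, 0
    for _ in range(generations):
        resistant *= 2                   # resistant cells divide; progeny stay resistant
        daughters = sensitive * 2
        new_mutants = sum(1 for _ in range(daughters) if rng.random() < mu)
        resistant += new_mutants
        sensitive = daughters - new_mutants
    return resistant


def adaptive_mutation_culture(rng, generations, p):
    """No mutation during growth; each final cell becomes resistant with probability p."""
    cells = 2 ** generations
    return sum(1 for _ in range(cells) if rng.random() < p)


def variance_to_mean(counts):
    return pvariance(counts) / mean(counts)


def compare(n_cultures=300, generations=10, mu=1e-3, seed=1):
    """Return the variance-to-mean ratios under the two hypotheses."""
    rng = random.Random(seed)
    rand = [random_mutation_culture(rng, generations, mu) for _ in range(n_cultures)]
    # Choose p so that both hypotheses give the same average number of mutants.
    p = mean(rand) / 2 ** generations
    adapt = [adaptive_mutation_culture(rng, generations, p) for _ in range(n_cultures)]
    return variance_to_mean(rand), variance_to_mean(adapt)


ratio_random, ratio_adaptive = compare()
print("random mutation:  ", round(ratio_random, 1))
print("adaptive mutation:", round(ratio_adaptive, 1))
```

Under the adaptive hypothesis the counts behave approximately like a Poisson variable, so the ratio stays close to 1, whereas mutations arising early in growth produce occasional cultures with very many resistant cells and a much larger ratio. It was variation of this second kind between cultures that Luria and Delbrück took as evidence for random mutation.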

' • " ■■ ' *' '"' ' ' • ^'vl^ , . j * • " . **«*"'*"* 

the pre™'*" ~* \*rt™* in the medium. The', results of 
>(^rnst 

lactose ;&di* only ' ^mM^^^m^^^^WB 




containing 










Box 13.2: DNA repair and human disease 

The importance of DNA repair is emphasized by the
number and severity of inherited human diseases which
have been linked with defects in one of the repair
processes. One of the best characterized of these is
xeroderma pigmentosum, which results from a mutation
in any one of several genes for proteins involved in
nucleotide excision repair. The disease symptoms
include hypersensitivity to UV radiation, so patients suffer
more mutations on exposure to sunlight, this often
leading to skin cancer (Lehmann, 1995). Two other diseases,
Cockayne syndrome and trichothiodystrophy, are
also caused by defects in nucleotide excision repair, but
these are more complex disorders which, although not
involving cancer, usually include problems with both the
skin and nervous system.

A few diseases have been linked with defects in
the transcription-coupled component of nucleotide
excision repair. These include breast and ovarian cancers,
the BRCA1 gene that confers susceptibility to
these cancers coding for a protein that has been implicated,
at least indirectly, in transcription-coupled
repair (Gowen et al., 1998). A deficiency in
transcription-coupled repair has also been identified in humans
suffering from the cancer susceptibility syndrome called
HNPCC (hereditary nonpolyposis colorectal cancer;
Mellon et al., 1996), though this disease was originally
identified as a defect in mismatch repair (Kolodner,
1995). Other diseases that seem to involve a breakdown
in some aspect of DNA repair, but whose direct
causes have not been uncovered, are ataxia telangiectasia,
whose symptoms include sensitivity to ionizing radiation,
Bloom's syndrome, which is probably due to
inactivation of DNA ligase genes, and Fanconi's anemia,
which confers sensitivity to chemicals that cause
crosslinks in DNA.



* Recombination repair is used to mend double-strand
  breaks.

In this section we will look at the first four types of repair 
system, leaving recombination repair until Box 13.4, this 
last system being easier to understand after we have dealt 
with the more general principles of recombination. 



Relatively few forms of DNA damage can be repaired 
without excision of nucleotides. Those that can be 
repaired by direct methods are as follows: 

* Nicks can be repaired by a DNA ligase if all that has
  happened is that a phosphodiester bond has been
  broken, without damage to the 5'-phosphate and
  3'-hydroxyl groups of the nucleotides either side of
  the nick (Figure 13.18). This is often the case with
  nicks resulting from the effects of ionizing radiation.



Figure 13.18 Repair of a nick by DNA ligase.
(Steps shown: nick; DNA ligase; nick is repaired.)



Some forms of alkylation damage are directly 
reversible by enzymes that transfer the alkyl group 
from the nucleotide to their own polypeptide chains. 
Enzymes capable of doing this are known in many 
different organisms and include the Ada enzyme of 
E. coli, which is involved in an adaptive process that 
this bacterium is able to activate in response to DNA 
damage. Ada removes alkyl groups attached to the 
oxygen groups at positions 4 and 6 of thymine and 
guanine, respectively, and can also repair phospho- 
diester bonds that have become methylated. Other 
alkylation repair enzymes have more restricted 
specificities, an example being human MGMT
(O6-methylguanine-DNA methyltransferase) which, as
its name suggests, only removes alkyl groups from
position 6 of guanine.

Cyclobutyl dimers are repaired by a light-dependent 
direct system called photoreactivation. In E. coli, 
the process involves the enzyme called DNA photo- 
lyase (more correctly named deoxyribodipyrimidine 
photolyase). When stimulated by light with a wave- 
length between 300 and 500 nm the enzyme binds to 
cyclobutyl dimers and converts them back to the 
original monomeric nucleotides. Photoreactivation is 
a widespread but not universal type of repair: it is 
known in many but not all bacteria and also in quite 
a few eukaryotes, including some vertebrates, but is 
absent in humans. A similar type of photoreactiva- 
tion involves the (6-4) photoproduct photolyase and 
results in repair of (6-4) lesions. Neither E. coli nor 
humans have this enzyme but it is possessed by a
variety of other organisms. 



Base excision is the least complex of the various repair
systems that involve removal of a damaged nucleotide
followed by resynthesis of DNA to span the resulting gap.
It is used to repair many modified nucleotides that have






Box 13.3: Cell cycle checkpoints for 
monitoring DNA damage 

In a multicellular organism, the death of a single somatic
cell as a result of DNA damage is usually less dangerous
than allowing that cell to replicate its mutated DNA and
possibly give rise to a tumor or other cancerous
growth. Eukaryotic cells therefore monitor their
genomes for damage, principally at the checkpoints
immediately before the entry into the S and M phases of
the cell cycle (Section 12.4.1; Russell, 1998). These
checkpoints ensure that a damaged genome is not replicated,
which would lead to mutations being perpetuated
in the cell lineage, and prevent problems with the distribution
of chromosomes to daughter cells, which might
occur if one or more chromosomes has extensive DNA
damage. A cell that fails to pass one or other of the
checkpoint tests might undergo cell cycle arrest, permanently
or until its DNA is repaired, or it might be forced
into programmed cell death or apoptosis (Chernova et
al., 1995; Enoch and Norbury, 1995).

In mammals, a central player in induction of cell
cycle arrest and apoptosis is the protein called p53. This
is classified as a tumor-suppressor protein, because
when this protein is defective, cells with damaged
genomes can avoid the cell cycle checkpoints and possibly
proliferate into a cancer. p53 is a sequence-specific
DNA-binding protein that activates a number of genes
thought to be directly responsible for arrest and apoptosis,
and also represses expression of others that must
be switched off to facilitate these processes. A second
protein that might play a regulatory role at the checkpoints
is the product of the human ATM gene which,
when defective, gives rise to ataxia telangiectasia, one of
the diseases associated with a deficiency in DNA repair
(see Box 13.2).



suffered relatively minor damage to their bases. The
process is initiated by a DNA glycosylase which cleaves
the β-N-glycosidic bond between a damaged base and the
sugar component of the nucleotide (Figure 13.19A). Each
DNA glycosylase has a limited specificity, the specificities
of the glycosylases possessed by a cell determining the
range of damaged nucleotides that can be repaired by the
base excision pathway. Most organisms are able to deal
with deaminated bases such as uracil (deaminated cytosine)
and hypoxanthine (deaminated adenine), oxidation
products such as 5-hydroxycytosine and thymine glycol,
and methylated bases such as 7-methylguanine and
2-methylcytosine (Seeberg et al., 1995). Other DNA
glycosylases remove normal bases as part of the mismatch
repair system.

DNA glycosylase removes a damaged base by 'flipping'
the structure to a position outside of the helix and then 
detaching it from the polynucleotide (Kunkel and Wilson, 
1996; Roberts and Cheng, 1998). This creates an AP or 
baseless site (see Figure 13.10, p. 341) which is converted 



into a single nucleotide gap in the second step of the 
repair pathway (Figure 13.19B). This step can be carried 
out in a variety of ways. The standard method makes use 
of an AP endonuclease, such as exonuclease III or 
endonuclease IV of E. coli, which cuts the phosphodiester 
bond on the 5' side of the AP site. Some AP endonucle- 
ases can also remove the sugar from the AP site, this 
being all that remains of the damaged nucleotide, but oth- 
ers lack this ability and so work in conjunction with a 
separate phosphodiesterase. An alternative pathway for 
converting the AP site into a gap utilizes the endonuclease
activity possessed by some DNA glycosylases, which
can make a cut at the 3' side of the AP site, probably at the 
same time that the damaged base is removed, followed 
again by removal of the sugar by a phosphodiesterase. 

The single nucleotide gap is filled by a DNA polymerase,
using the undamaged base in the other strand of
the DNA molecule to ensure that the correct nucleotide is
inserted. In E. coli the gap is filled by DNA polymerase I
and in mammals by DNA polymerase β (see Table 12.2,
p. 313; Sobol et al., 1996). Yeast seems to be unusual in that
it uses its main DNA replicating enzyme, DNA polymerase δ,
for this purpose (Seeberg et al., 1995). After
gap filling, the final phosphodiester bond is put in place
by a DNA ligase.
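
The four steps of the base excision pathway can be sketched as a toy simulation. This is only an illustration under stated assumptions: the function name, the one-letter symbols U (uracil) and I (hypoxanthine), and the string representation of the strands are all hypothetical conveniences, not part of the biochemistry described above.

```python
# Toy model of base excision repair. All names are illustrative assumptions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Symbols chosen here for two damaged bases named in the text:
# U = uracil (deaminated cytosine), I = hypoxanthine (deaminated adenine).
DAMAGED_BASES = {"U", "I"}


def base_excision_repair(damaged_strand, partner_strand):
    """Return the repaired version of damaged_strand.

    partner_strand is written so that position i pairs with position i of
    damaged_strand (that is, it is listed in the opposite polarity).
    """
    strand = list(damaged_strand)
    for i, base in enumerate(strand):
        if base not in DAMAGED_BASES:
            continue
        strand[i] = "AP"  # 1. glycosylase removes the base, leaving an AP site
        strand[i] = None  # 2. AP endonuclease (and phosphodiesterase) leave a gap
        strand[i] = COMPLEMENT[partner_strand[i]]  # 3. polymerase fills the gap
        # 4. DNA ligase seals the final phosphodiester bond (not modelled)
    return "".join(strand)


print(base_excision_repair("GACUGT", "CTGGCA"))
```

The key design point is that the replacement nucleotide is chosen only from the undamaged partner strand, which mirrors the role of the template in gap filling.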

Nucleotide excision repair is used to correct more
extensive types of damage

Nucleotide excision repair has a much broader specificity 
than the base excision system and is able to deal with 
more extreme forms of damage such as intrastrand 
crosslinks and bases that have become modified by 
attachment of large chemical groups. It is also able to 
correct cyclobutyl dimers by a dark repair process, pro- 
viding those organisms that do not have the photoreact- 
ivation system with a means of repairing these dimers. 

In nucleotide excision repair, a segment of single- 
stranded DNA containing the damaged nucleotide(s) is 
excised and replaced with new DNA. The process is 
therefore similar to base excision repair except that it is 
not preceded by selective base removal and a longer 
stretch of polynucleotide is excised. The best studied 
example of nucleotide excision repair is the short patch 
process of E. coli, so called because the region of poly- 
nucleotide that is excised and subsequently 'patched' is 
relatively short, usually 12 nucleotides in length. 

Short patch repair is initiated by a multienzyme com- 
plex called the UvrABC endonuclease, sometimes also 
referred to as the 'excinuclease'. In the first stage of the 
process a trimer comprising two UvrA proteins and one 
copy of UvrB attaches to the DNA at the damaged site. 
How the site is recognized is not known but the broad 
specificity of the process indicates that individual types of 
damage are not directly detected and that the complex 
must search for a more general attribute of DNA damage 
such as distortion of the double helix. UvrA may be the 
part of the complex most involved in damage location 
because it dissociates once the site has been found and 
plays no further part in the repair process. Departure of 






Figure 13.19 Base excision repair.

(A) Excision of a damaged nucleotide by a DNA glycosylase. (B) Schematic representation of the base excision repair pathway:
damaged nucleotide; AP site (after DNA glycosylase); single nucleotide gap (after AP endonuclease, possibly with a
phosphodiesterase); repaired DNA (after DNA polymerase + DNA ligase).
Alternative versions of the pathway are described in the text.



UvrA allows UvrC to bind (Figure 13.20), forming a 
UvrBC dimer that cuts the polynucleotide either side of 
the damaged site. The first cut is made by UvrB at the 
fifth phosphodiester bond downstream of the damaged 
nucleotide, and the second cut is made by UvrC at the 
eighth phosphodiester bond upstream, resulting in the 12 
nucleotide excision, though there is some variability, 



especially in the position of the UvrB cut site. The excised 
segment is then removed, usually as an intact oligonu- 
cleotide, by DNA helicase II which presumably detaches 
the segment by breaking the base pairs holding it to the 
second strand. UvrC also detaches at this stage, but UvrB 
remains in place and bridges the gap produced by the 
excision, possibly to prevent the single-stranded region 






Figure 13.20 Short patch nucleotide excision repair in E. coli.

The damaged nucleotide is shown distorting the helix because
this is thought to be one of the recognition signals for the
UvrAB trimer that initiates the short patch process. The steps
shown are: the UvrAB trimer attaches; UvrA departs and UvrC
attaches; segment excision (cuts by UvrB and UvrC) and removal
of the single strand by helicase II, with UvrB bridging the gap;
DNA polymerase I + DNA ligase. See the
text for details of the events occurring during the repair
pathway.



that has been exposed from base-pairing with itself, pos- 
sibly to prevent this strand from becoming damaged, or 
possibly to direct the DNA polymerase to the site that 
needs to be repaired. As in base excision repair, the gap is 
filled by DNA polymerase I and the last phosphodiester 
bond is synthesized by DNA ligase. 
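
The arithmetic behind the 12 nucleotide excision can be checked with a short sketch. It assumes one particular counting convention, which the text does not spell out: the first phosphodiester bond on each side is taken to be the one immediately adjacent to the damaged nucleotide. Under that assumption, cutting the eighth bond upstream and the fifth bond downstream removes 7 + 1 + 4 = 12 nucleotides. The function name and parameters are hypothetical.

```python
def short_patch_excision(seq, damage_index, upstream_bond=8, downstream_bond=5):
    """Return (start, end, excised) for a short patch excision.

    Assumed convention: bond 1 on each side is the phosphodiester bond
    directly next to the damaged nucleotide, so cutting bond n removes
    n - 1 nucleotides on that side plus the damaged nucleotide itself.
    start and end are 0-based, inclusive positions in seq (written 5'->3').
    """
    start = damage_index - (upstream_bond - 1)
    end = damage_index + (downstream_bond - 1)
    if start < 0 or end >= len(seq):
        raise ValueError("damage is too close to the end of the sequence")
    return start, end, seq[start:end + 1]


start, end, excised = short_patch_excision("ACGTACGTACGTACGTACGTACGT", 10)
print(start, end, excised, len(excised))
```

Changing the two bond parameters gives a quick way to explore the variability in cut positions that the text mentions.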

E. coli also has a long patch nucleotide excision repair 
system that involves Uvr proteins but differs in that the 
piece of DNA that is excised can be anything up to 2 kb in 
length. Long patch repair has been less well studied and 
the process is not understood in detail, but it is presumed 
to work on more extensive forms of damage, possibly 
regions where groups of nucleotides, rather than just 
single bases, have become modified. The eukaryotic 
nucleotide excision repair process is also called 'long 
patch' but results in replacement of only 24-29 nucleotides 



of DNA. In fact, there is no 'short patch' system in 
eukaryotes and the name is used to distinguish the 
process from base excision repair. The system is more 
complex than in E. coli and the relevant enzymes do not
seem to be homologs of the Uvr proteins. In humans at 
least 16 proteins are involved, with the downstream cut 
being made at the same position as in E. coli - the fifth 
phosphodiester bond - but with a more distant upstream 
cut, resulting in the longer excision. Both cuts are made 
by endonucleases that attack single-stranded DNA specif- 
ically at its junction with a double-stranded region, indi- 
cating that before the cuts are made the DNA around the 
damage site has been melted, presumably by a helicase 
(Figure 13.21). This activity is provided at least in part by
TFIIH, one of the components of the RNA polymerase II 
initiation complex (see Table 8.3, p. 184). At first it was 
assumed that TFIIH simply has a dual role in the cell, 
functioning separately in both transcription and repair, but 
now it is thought that there is a more direct link between 
the two processes (Lehmann, 1995; Svejstrup et al., 1996).
This view is supported by the discovery of transcription- 
coupled repair, which results in the template strands of 
genes being repaired more quickly than other parts of the 
eukaryotic genome, an observation that is entirely logical 
as these template strands contain the genome's biological 
information and maintaining their integrity is the highest 
priority for the repair systems. 

Each of the three repair systems that we have looked at so
far recognizes and acts upon DNA damage caused by
mutagens. This means that they search for abnormal 
chemical structures such as modified nucleotides, 
cyclobutyl dimers and intrastrand crosslinks. They can- 
not correct mismatches resulting from errors in replica- 
tion because the mismatched nucleotide is not abnormal 
in any way; it is simply an A, C, G or T that has been
inserted at the wrong position. As these nucleotides look 
exactly like any other nucleotide the mismatch repair 
system that corrects replication errors has to detect not 
the mismatched nucleotide itself but the absence of base- 
pairing between the parent and daughter strands. Once it 
has found a mismatch, the repair system excises part of 
the daughter polynucleotide and fills in the gap, in a 
manner similar to what we have already seen with base 
and nucleotide excision repair. 

The scheme described above leaves one important 
question unanswered. The repair must be made in the 
daughter polynucleotide because it is in this newly syn- 
thesized strand that the error has occurred: the parent 
polynucleotide has the correct sequence. How does the 
repair process know which strand is which? In E. coli the 
answer is that the daughter strand is, at this stage, under- 
methylated and therefore can be distinguished from 
the parent polynucleotide, which has a full complement 
of methyl groups. E. coli DNA is methylated because of 
the activities of the DNA adenine methylase (Dam), 
which converts adenines to 6-methyladenines in the 
sequence 5'-GATC-3', and the DNA cytosine methylase 






Figure 13.21 Outline of the events involved in nucleotide
excision repair in eukaryotes.

The endonucleases that remove the damaged region make
cuts specifically at the junction between single-stranded and
double-stranded regions of a DNA molecule. The DNA is
therefore thought to melt either side of the damaged
nucleotide, as shown in the diagram, possibly due to the
helicase activity of TFIIH. The steps shown are: damaged
nucleotide causing helix distortion; melted region; excision
points; excision of the damaged region; DNA polymerase +
DNA ligase.



(Dcm), which converts cytosines to 5-methylcytosines in
5'-CCAGG-3' and 5'-CCTGG-3'. These methylations are
not mutagenic, the modified nucleotides having the same
base-pairing properties as the unmodified versions. There
is a delay between DNA replication and methylation of the
daughter strand, and it is during this window of opportunity
that the repair system scans the DNA for mismatches
and makes the required corrections in the
undermethylated daughter strand (Figure 13.22).
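
As a rough illustration of where these methylation sequences fall in a piece of DNA, the sketch below scans one strand for the Dam target 5'-GATC-3' and the Dcm targets 5'-CCAGG-3' and 5'-CCTGG-3'. The function name and the example sequence are made up for this purpose, and the code only reports where each motif starts; it makes no claim about which base within the motif carries the methyl group.

```python
import re


def methylation_motifs(seq):
    """Return 0-based start positions of Dam and Dcm target sequences.

    seq is one strand written 5'->3'. Only this strand is scanned; both
    target sequences read the same (or as each other) on the partner strand.
    """
    dam_sites = [m.start() for m in re.finditer(r"GATC", seq)]
    dcm_sites = [m.start() for m in re.finditer(r"CC[AT]GG", seq)]
    return {"Dam": dam_sites, "Dcm": dcm_sites}


print(methylation_motifs("AAGATCTTCCAGGAACCTGGGATC"))
```

A fully methylated parent strand would carry a methyl group at every site reported, whereas a newly made daughter strand would briefly carry none.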

E. coli has at least three mismatch repair systems, called 
'long patch', 'short patch' and 'very short patch', the
names indicating the relative lengths of the excised and 
resynthesized segments. The long patch system replaces 
up to a kb or more of DNA and requires the MutH, MutL 
and MutS proteins, as well as the DNA helicase II that we 
met during nucleotide excision repair. MutS recognizes 
the mismatch and MutH distinguishes the two strands by 
binding to unmethylated 5'-GATC-3' sequences (Figure
13.23). The role of MutL is unclear but it might coordinate
the activities of the two other proteins so that MutH binds 
to 5'-GATC-3' sequences only in the vicinity of mismatch 
sites recognized by MutS. After binding, MutH cuts the 



Figure 13.22 Methylation of newly synthesized DNA in E.
coli does not occur immediately after replication, providing a
window of opportunity for the mismatch repair proteins to
recognize the daughter strands and correct replication errors.
(Stages shown: parent molecule, fully methylated; daughter
molecules, new DNA not yet methylated; daughter molecules,
fully methylated.)



phosphodiester bond immediately upstream of the G in 
the methylation sequence and DNA helicase II detaches 
the single strand. There does not appear to be an enzyme 
that cuts the strand downstream of the mismatch; instead 
the detached single-stranded region is degraded by an 
exonuclease that follows the helicase and continues 
beyond the mismatch site. The gap is then filled in by 
DNA polymerase I and DNA ligase. 
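
The logic of strand discrimination followed by correction can be sketched in a few lines. This is a simplified model under stated assumptions: the strands are plain strings aligned so that position i of one pairs with position i of the other, methylation is reduced to a count per strand, and the correction rewrites only the mismatched positions rather than excising and resynthesizing a long patch. All function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def identify_daughter(methyl_counts):
    """Pick the undermethylated strand from a {name: methyl group count} dict."""
    return min(methyl_counts, key=methyl_counts.get)


def find_mismatches(parent, daughter):
    """Return positions where the daughter base cannot pair with the parent."""
    return [i for i, (p, d) in enumerate(zip(parent, daughter))
            if COMPLEMENT[p] != d]


def repair_daughter(parent, daughter):
    """Correct the daughter strand, always trusting the parent sequence."""
    fixed = list(daughter)
    for i in find_mismatches(parent, daughter):
        fixed[i] = COMPLEMENT[parent[i]]
    return "".join(fixed)


print(identify_daughter({"top": 3, "bottom": 0}))
print(find_mismatches("GACAGTACGA", "CTGTAATGCT"))
print(repair_daughter("GACAGTACGA", "CTGTAATGCT"))
```

The point of the sketch is that the repair is never symmetrical: once the daughter strand has been identified, only that strand is changed.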

Similar events are thought to occur during short and 
very short mismatch repair, the difference being the speci- 
ficities of the proteins that recognize the mismatch. The 
short patch system, which results in excision of a segment 






(Figure 13.23, first steps: mismatch in daughter strand; attachment of MutH and MutS; MutH cuts the DNA; strand detachment and removal by DNA helicase II and an exonuclease.)



13.2 RECOMBINATION 

Without recombination, genomes would be relatively sta- 
tic structures, undergoing very little change. Over a long 
period of time the gradual accumulation of mutations 
would result in small scale alterations in the nucleotide 
sequence of the genome, but more extensive restructur- 
ing, which is the role of recombination, would not occur. 
The evolutionary potential of the genome would be 
severely restricted. 

Recombination was first recognized as the process 
responsible for crossing-over and exchange of DNA 
segments between homologous chromosomes during 
meiosis of eukaryotic cells (see Figure 2.10, p. 26), and
subsequently implicated in the integration of transferred 
DNA into bacterial genomes after conjugation, transduc- 
tion or transformation (Section 2.3.2). The biological 
importance of these processes stimulated the first attempts 
to describe the molecular events involved in recombina- 
tion and led to the Holliday model (Holliday, 1964), with 
which we will begin our study of recombination. 



(Figure 13.23, final step: DNA polymerase I + DNA ligase.)

Figure 13.23 Long patch mismatch repair in E. coli.
See the text for details.



less than 10 nucleotides in length, begins when MutY rec- 
ognizes an A-G or A-C mismatch, and the very short 
repair system corrects G-T mismatches which are recog- 
nized by the Vsr endonuclease. 
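
The division of labor between the three E. coli systems, as described in the text, can be summarized as a simple lookup. This is a deliberately crude sketch: it treats the specificities as mutually exclusive and sends every other mismatch to the long patch system, which is an assumption made only for illustration. The function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_system(base1, base2):
    """Name the E. coli mismatch repair system suggested by the text."""
    if COMPLEMENT[base1] == base2:
        raise ValueError("bases are correctly paired; nothing to repair")
    pair = frozenset((base1, base2))
    if pair in (frozenset("AG"), frozenset("AC")):
        return "short patch (MutY)"
    if pair == frozenset("GT"):
        return "very short patch (Vsr)"
    # Assumption: anything else is left to the MutS-dependent system.
    return "long patch (MutS, MutH, MutL)"


for bases in [("G", "A"), ("T", "G"), ("C", "T")]:
    print(bases, mismatch_system(*bases))
```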

Eukaryotes have homologs of the E. coli Mut proteins 
and their mismatch repair processes probably work in a 
similar way. The one difference is that methylation might 
not be the method used to distinguish between the parent 
and daughter polynucleotides (Modrich and Lahue, 
1996). Methylation has been implicated in mismatch 
repair in mammalian cells, but the DNA of some eukary- 
otes, including fruit flies and yeast, is not extensively 
methylated (see Box 8.1, p. 177); it is thought that these 
organisms must therefore use a different method. Possi- 
bilities include an association between the repair 
enzymes and the replication complex, so that repair is 
coupled with DNA synthesis, or use of single-strand 
binding proteins that mark the parent strand. 



The Holliday model refers to a type of recombination 
called general or homologous recombination. This is the 
most important version of recombination in nature, being 
responsible for meiotic crossing-over and integration of 
transferred DNA into bacterial genomes. 

The Holliday model describes recombination between 
two homologous double-stranded molecules, ones with 
identical or nearly identical sequences, but is equally 
applicable to two different molecules that share a limited 
region of homology, or a single molecule that recombines 
with itself because it contains two separate regions that 
are homologous with one another. 

The central feature of the model is formation of a het- 
eroduplex resulting from the exchange of polynucleotide 
segments between the two homologous molecules (Figure
13.24). The heteroduplex is initially stabilized by base-pairing
between each transferred strand and the intact
polynucleotide of the recipient molecule, this base-pairing 
being possible because of the sequence similarity between 
the two molecules. Subsequently the gaps are sealed by 
DNA ligase, giving a Holliday structure. This structure is
dynamic: branch migration, which results in exchange of
longer segments of DNA, is possible if the two helices
rotate in the same direction.

Separation, or resolution, of the Holliday structure 
back into individual double-stranded molecules occurs 
by cleavage across the branchpoint. This is the key to the 
entire process because the cut can be made in either of 
two orientations, as becomes apparent when the
three-dimensional configuration or chi form of the Holliday
structure is examined (see Figure 13.24). These two cuts 
have very different results. If the cut is made left-right 



